
7. LEAST SQUARES ESTIMATION

EXERCISE: Least-Squares Estimation and Uniqueness of Estimates

1. For $n$ real numbers $a_1, \ldots, a_n$, what value of $a$ minimizes the sum of squared distances from $a$ to each of the $a_i$: $\sum_{i=1}^n (a_i - a)^2$? (Prove it.)

2. Here are two datasets, given as $(x, y)$ pairs. For each dataset:

   - Sketch a scatterplot of the data.
   - What is the least-squares line $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$? That is, what is the line that minimizes the residual sum of squares?
   - What is $\hat{Y}$? What is $\hat{\beta}$?

   Dataset A: $\{(1,1), (1,2), (1,3), (1,5)\}$.
   Dataset B: $\{(1,1), (-1,2), (1,3), (-1,5)\}$.

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is $\hat{Y}$ always unique? Yes. Is $\hat{\beta}$ always unique? No.
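Part 3 can be checked numerically. Below is a minimal numpy sketch (np.linalg.lstsq returns one least-squares solution, the minimum-norm one when $X$ is rank-deficient; the code itself is illustrative):

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 5.0])
# Model y_i = beta_0 + beta_1 x_i: first column of X is the intercept.
XA = np.column_stack([np.ones(4), [1.0, 1.0, 1.0, 1.0]])    # Dataset A: rank 1
XB = np.column_stack([np.ones(4), [1.0, -1.0, 1.0, -1.0]])  # Dataset B: rank 2

for name, X in [("A", XA), ("B", XB)]:
    beta_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    # For A, any (b0, b1) with b0 + b1 = 2.75 gives the same fitted vector,
    # so beta_hat is not unique; the fitted vector Y_hat is unique in both cases.
    print(name, "rank:", rank, "beta_hat:", beta_hat, "Y_hat:", X @ beta_hat)
```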

7.1 Least Squares Estimators

Recall the linear model $Y = X\beta + \varepsilon$, i.e.

$$
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
=
\begin{pmatrix}
x_{10} & x_{11} & \cdots & x_{1,p-1} \\
x_{20} & x_{21} & \cdots & x_{2,p-1} \\
\vdots & & & \vdots \\
x_{n0} & x_{n1} & \cdots & x_{n,p-1}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
$$

Definition: An estimate $\hat{\beta}$ is a least-squares estimate of $\beta$ if it minimizes the length $\|Y - X\beta\|$ over all $\beta$.

Note: least-squares is a mathematical criterion, not a statistical criterion.

Let $x_0, x_1, \ldots, x_{p-1}$ be the columns of $X$. Then

$$
X\beta = \begin{pmatrix} x_0 & x_1 & \cdots & x_{p-1} \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{pmatrix}
= \beta_0 x_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1} \in R(X),
$$

the range (column space) of $X$.

Questions: Why do we say "a least-squares estimate" instead of "the least-squares estimate"? If there is more than one least-squares estimate, what is the geometric interpretation?
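A quick numerical illustration of $X\beta$ as a linear combination of the columns of $X$ (a sketch assuming numpy; the matrix and coefficients are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(5, 3))
beta = np.array([2.0, -1.0, 0.5])

# beta_0 x_0 + beta_1 x_1 + beta_2 x_2, built column by column.
combo = sum(beta[j] * X[:, j] for j in range(3))
print(np.allclose(X @ beta, combo))  # True: X beta lies in the column space of X
```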

A least-squares estimate can be found by solving the following minimization problem:

Minimize $\|Y - \theta\|$ over $\theta \in R(X)$.

7.2 Orthogonal Projection

Lemma 7.2.1: $Y$ can be uniquely decomposed as $Y = \hat{Y} + \hat{\varepsilon}$, where $\hat{Y} \in R(X)$ and $\hat{\varepsilon} \in [R(X)]^\perp$, the orthogonal complement of $R(X)$:

$$[R(X)]^\perp = \{a : X'a = 0\}.$$

Definition: $\hat{Y}$ is the orthogonal projection of $Y$ onto $R(X)$. It is also called the fitted vector or vector of fitted values.

[Figure: $Y$ decomposed into $\hat{Y} \in R(X)$ and the orthogonal residual $\hat{\varepsilon}$.]

Proof:

Existence: There must be at least one such decomposition because $R(X)$ and $[R(X)]^\perp$ together span $\mathbb{R}^n$.

Uniqueness: Suppose $Y = \hat{Y}_1 + \hat{\varepsilon}_1$ and $Y = \hat{Y}_2 + \hat{\varepsilon}_2$. Then $(\hat{Y}_1 - \hat{Y}_2) + (\hat{\varepsilon}_1 - \hat{\varepsilon}_2) = 0$. Taking the inner product of this vector with itself, we obtain

$$
0 = \big[(\hat{Y}_1 - \hat{Y}_2) + (\hat{\varepsilon}_1 - \hat{\varepsilon}_2)\big]'\big[(\hat{Y}_1 - \hat{Y}_2) + (\hat{\varepsilon}_1 - \hat{\varepsilon}_2)\big]
= \|\hat{Y}_1 - \hat{Y}_2\|^2 + \|\hat{\varepsilon}_1 - \hat{\varepsilon}_2\|^2 + 2(\hat{Y}_1 - \hat{Y}_2)'(\hat{\varepsilon}_1 - \hat{\varepsilon}_2)
= \|\hat{Y}_1 - \hat{Y}_2\|^2 + \|\hat{\varepsilon}_1 - \hat{\varepsilon}_2\|^2,
$$

where the cross term vanishes because $\hat{Y}_1 - \hat{Y}_2 \in R(X)$ and $\hat{\varepsilon}_1 - \hat{\varepsilon}_2 \in [R(X)]^\perp$. Hence $\hat{Y}_1 - \hat{Y}_2 = 0$ and $\hat{\varepsilon}_1 - \hat{\varepsilon}_2 = 0$.
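The decomposition is easy to verify numerically. A minimal sketch, assuming numpy and an arbitrary full-rank $X$ with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))   # full-rank model matrix (illustrative)
Y = rng.normal(size=10)

# Orthogonal projection of Y onto R(X): P = X (X'X)^{-1} X'.
P = X @ np.linalg.solve(X.T @ X, X.T)
Y_hat = P @ Y
eps_hat = Y - Y_hat

print(np.allclose(Y, Y_hat + eps_hat))   # Y = Y_hat + eps_hat
print(np.allclose(X.T @ eps_hat, 0.0))   # eps_hat in [R(X)]^perp: X' eps_hat = 0
```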

Lemma 7.2.2: The orthogonal projection solves the least-squares minimization problem.

Proof: For any $\theta \in R(X)$, $(Y - \hat{Y})'(\hat{Y} - \theta) = 0$. Therefore

$$
\|Y - \theta\|^2 = \|Y - \hat{Y} + \hat{Y} - \theta\|^2 = \|Y - \hat{Y}\|^2 + \|\hat{Y} - \theta\|^2,
$$

which is minimized by $\theta = \hat{Y}$.

[Figure: right triangle with vertices $Y$, $\hat{Y}$, and $\theta$; the leg $Y - \hat{Y}$ is orthogonal to $\hat{Y} - \theta$.]

We have just established that the vector in $R(X)$ that is closest to $Y$ ("closest" in the least-squares sense) is the projection of $Y$ onto $R(X)$.
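A numerical illustration of the Pythagorean identity in this proof (a sketch assuming numpy; each $\theta$ is drawn at random from $R(X)$):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
Y = rng.normal(size=10)
Y_hat = X @ np.linalg.solve(X.T @ X, X.T @ Y)
rss = np.sum((Y - Y_hat) ** 2)

for _ in range(5):
    theta = X @ rng.normal(size=3)  # an arbitrary point of R(X)
    lhs = np.sum((Y - theta) ** 2)
    rhs = rss + np.sum((Y_hat - theta) ** 2)
    assert np.isclose(lhs, rhs) and lhs >= rss   # identity holds; Y_hat is closest
print("Pythagorean identity verified; the projection minimizes the distance.")
```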

7.3 Normal Equations

Since $Y - \hat{Y} \in [R(X)]^\perp$, we know that

$$X'(Y - \hat{Y}) = 0.$$

This implies that

$$X'Y = X'\hat{Y}.$$

Since $\hat{Y} \in R(X)$, we can write $\hat{Y} = X\hat{\beta}$. So we have

$$X'Y = X'X\hat{\beta}.$$

We have just proved:

Lemma 7.3.1: A least squares estimate of $\beta$, denoted $\hat{\beta}$, is a solution to the normal equations:

$$X'X\hat{\beta} = X'Y.$$

Note: An alternative derivation of the normal equations uses derivatives to find a minimum of $\|Y - X\beta\|^2$ (Seber & Lee, p. 38).
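In the full-rank case the normal equations can be solved directly. A minimal numpy sketch (np.linalg.lstsq is used only as an independent cross-check; the random data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3))
Y = rng.normal(size=10)

# Solve X'X beta = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(beta_hat, np.linalg.lstsq(X, Y, rcond=None)[0]))  # True
```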

7.4 Residual Vector

Definition: The residual vector is $\hat{\varepsilon} = Y - \hat{Y} = Y - X\hat{\beta}$.

Definition: The residual sum of squares is defined by

$$\mathrm{RSS} = \hat{\varepsilon}'\hat{\varepsilon} = \sum_{i=1}^n \hat{\varepsilon}_i^2 = (Y - X\hat{\beta})'(Y - X\hat{\beta}).$$

7.5 The Full Rank Case

If $\mathrm{rank}(X_{n \times p}) = p$, then $X$ has full rank (the largest possible, assuming $p \le n$). Then $\mathrm{rank}(X'X) = p$ (Seber & Lee, A2.4), so $(X'X)^{-1}$ exists. In this case the normal equations have the unique solution

$$\hat{\beta} = (X'X)^{-1}X'Y.$$

The orthogonal projection (fitted vector) is

$$\hat{Y} = X\hat{\beta} = X(X'X)^{-1}X'Y = PY, \quad \text{where } P = X(X'X)^{-1}X'.$$

Note: $P$ is sometimes called the "hat matrix" because $PY = \hat{Y}$. It is a projection matrix, and it projects $Y$ onto $R(X)$.

Lemma 7.5.1: Let $P = X(X'X)^{-1}X'$, where $X$ has full rank. Then

(i) $P$ and $I - P$ are projection matrices.
(ii) $\mathrm{rank}(I - P) = \mathrm{tr}(I - P) = n - p$.
(iii) $PX = X$.

Interpretation: $P$ is projection onto $R(X)$; $I - P$ is projection onto $[R(X)]^\perp$.

For the residual vector we have

$$\hat{\varepsilon} = Y - \hat{Y} = Y - PY = (I - P)Y \quad (\text{note: } \hat{\varepsilon} \in [R(X)]^\perp),$$

and for the residual sum of squares we can write

$$\mathrm{RSS} = \hat{\varepsilon}'\hat{\varepsilon} = Y'(I - P)Y.$$
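The three parts of Lemma 7.5.1 and the RSS identity can all be checked numerically. A sketch assuming numpy and a full-rank random $X$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 10, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)  # hat matrix
I = np.eye(n)

print(np.allclose(P @ P, P), np.allclose(P, P.T))  # (i) P idempotent and symmetric
print(np.isclose(np.trace(I - P), n - p))          # (ii) tr(I - P) = n - p
print(np.allclose(P @ X, X))                       # (iii) PX = X

eps_hat = (I - P) @ Y
print(np.isclose(eps_hat @ eps_hat, Y @ (I - P) @ Y))  # RSS = Y'(I - P)Y
```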

7.6 The Less-Than-Full-Rank Case

Lemma: Let $\mathrm{rank}(X) = r < p$ and $P = X(X'X)^{-}X'$, where $(X'X)^{-}$ is a generalized inverse of $X'X$. Then

(i) $P$ and $I - P$ are projection matrices.
(ii) $\mathrm{rank}(I - P) = \mathrm{tr}(I - P) = n - r$.
(iii) $X'(I - P) = 0$.

Sketch of proof: There is a unique matrix $P$ such that $\hat{\theta} = PY$ (see Seber & Lee B1.2). One representation for $P$ is $P = X_1(X_1'X_1)^{-1}X_1'$, where $X_1$ consists of $r$ linearly independent columns of $X$.

(i) Show $P$ is idempotent and symmetric, and therefore a projection matrix:

$$P' = X_1(X_1'X_1)^{-1}X_1' = P, \qquad P^2 = P.$$

(ii) $\mathrm{rank}(I - P) = \mathrm{tr}(I - P)$ because $I - P$ is a projection matrix. But $\mathrm{tr}(I - P) = \mathrm{tr}(I) - \mathrm{tr}(P) = n - \mathrm{tr}(P)$, and

$$\mathrm{tr}(P) = \mathrm{tr}[X_1(X_1'X_1)^{-1}X_1'] = \mathrm{tr}[(X_1'X_1)^{-1}X_1'X_1] = \mathrm{tr}(I_{r \times r}) = r.$$

(iii) This is equivalent to $(I - P)X = 0$, or $PX = X$. This is clearly true, since $Px_j = x_j$ for every column $x_j$ of $X$, because $x_j \in R(X)$.
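A numerical sketch of the less-than-full-rank case (assuming numpy; np.linalg.pinv supplies one particular generalized inverse, the Moore-Penrose inverse, and the collinear design is constructed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, 2.0 * x1])  # third column = 2 * second: r = 2 < p = 3
Y = rng.normal(size=n)

G = np.linalg.pinv(X.T @ X)  # a generalized inverse of X'X
P = X @ G @ X.T

print(np.allclose(P @ P, P), np.allclose(P, P.T))    # (i)
print(np.isclose(np.trace(np.eye(n) - P), n - 2))    # (ii) n - r
print(np.allclose(X.T @ (np.eye(n) - P), 0.0))       # (iii)

# beta_hat is not unique, but Y_hat = PY is:
b1 = G @ X.T @ Y                       # one solution of the normal equations
b2 = b1 + np.array([0.0, 2.0, -1.0])   # another: (0, 2, -1) is in the null space of X
print(np.allclose(X @ b1, X @ b2), np.allclose(X @ b1, P @ Y))  # same fitted vector
```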

8. PROPERTIES OF LEAST SQUARES ESTIMATES

Basic Distributional Assumptions of the Linear Model:

1. The errors are unbiased: $E[\varepsilon] = 0$.
2. The errors are uncorrelated with common variance: $\mathrm{Cov}(\varepsilon) = \sigma^2 I$.
